We present NeSF, a method for producing 3D semantic fields from posed RGB images alone. In place of classical 3D representations, our method builds on recent work on implicit neural scene representations, in which 3D structure is captured by point-wise functions. We leverage this methodology to recover 3D density fields, on top of which we then train a 3D semantic segmentation model supervised by posed 2D semantic maps. Despite being trained on 2D signals alone, our method is able to generate 3D-consistent semantic maps from novel camera poses and can be queried at arbitrary 3D points. Notably, NeSF is compatible with any method that produces a density field, and its accuracy improves as the quality of the density field improves. Our empirical analysis demonstrates quality comparable to competitive 2D and 3D semantic segmentation baselines on complex, realistically rendered synthetic scenes. Our method is the first to offer truly dense 3D scene segmentation that requires only 2D supervision for training and needs no semantic input at inference time on novel scenes. We encourage readers to visit the project website.
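The 2D-only training signal can be made concrete with a short volumetric-rendering sketch. The snippet below is a minimal illustration under assumed interfaces (`density_fn`, `semantic_fn`, and all shapes are placeholders, not the authors' code): per-point class logits are composited along camera rays using the recovered density field, and the rendered 2D logits are supervised with cross-entropy against the posed 2D semantic maps.

```python
import torch
import torch.nn.functional as F

def render_semantics(ray_origins, ray_dirs, density_fn, semantic_fn,
                     near=0.1, far=6.0, n_samples=64):
    """Volume-render per-class logits along R rays (minimal sketch, assumed interfaces)."""
    t = torch.linspace(near, far, n_samples)                                  # (S,)
    pts = ray_origins[:, None, :] + t[None, :, None] * ray_dirs[:, None, :]   # (R, S, 3)
    sigma = density_fn(pts)      # (R, S) densities from the frozen density field
    logits = semantic_fn(pts)    # (R, S, C) per-point class logits from the 3D model

    delta = (far - near) / n_samples
    alpha = 1.0 - torch.exp(-sigma * delta)                                   # (R, S)
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-10], dim=1),
        dim=1)[:, :-1]                                                        # (R, S)
    weights = alpha * trans
    return (weights[..., None] * logits).sum(dim=1)                           # (R, C)

def semantic_loss(rendered_logits, gt_labels):
    # 2D-only supervision: cross-entropy against labels from the posed semantic maps.
    return F.cross_entropy(rendered_logits, gt_labels)
```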
A classical problem in computer vision is to infer a 3D scene representation from a few images that can be used to render novel views at interactive rates. Previous work focuses on reconstructing pre-defined 3D representations, e.g. textured meshes, or implicit representations, e.g. radiance fields, and often requires input images with precise camera poses and long processing times for each novel scene. In this work, we propose the Scene Representation Transformer (SRT), a method that processes posed or unposed RGB images of a new region, infers a "set-latent scene representation", and synthesizes novel views, all in a single feed-forward pass. To compute the scene representation, we propose a generalization of the Vision Transformer to sets of images, enabling global information integration and hence 3D reasoning. An efficient decoder transformer renders novel views by attending into the scene representation to parameterize a light field. Learning is supervised end-to-end by minimizing a novel-view reconstruction error. We show that this method outperforms recent baselines in terms of PSNR and speed on synthetic datasets, including a new dataset created for this paper. Furthermore, we demonstrate interactive visualization and semantic segmentation of real-world outdoor environments using Street View imagery.
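As a rough illustration of the decoder idea described above, the sketch below (layer sizes and interfaces are assumptions, not the released SRT architecture) cross-attends ray queries into the set-latent scene representation and maps the attended features to pixel colors.

```python
import torch
import torch.nn as nn

class LightFieldDecoder(nn.Module):
    """Hedged sketch: ray queries attend into set-latent scene tokens to predict colors."""
    def __init__(self, latent_dim=768, n_heads=12):
        super().__init__()
        self.ray_embed = nn.Linear(6, latent_dim)   # ray origin (3) + direction (3)
        self.cross_attn = nn.MultiheadAttention(latent_dim, n_heads, batch_first=True)
        self.to_rgb = nn.Sequential(nn.Linear(latent_dim, latent_dim), nn.GELU(),
                                    nn.Linear(latent_dim, 3))

    def forward(self, rays, scene_tokens):
        # rays: (B, R, 6) query rays; scene_tokens: (B, N, latent_dim) set-latent representation
        q = self.ray_embed(rays)
        attended, _ = self.cross_attn(q, scene_tokens, scene_tokens)
        return torch.sigmoid(self.to_rgb(attended))  # (B, R, 3) predicted colors
```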
Passive monitoring of acoustic or radio sources has important applications in modern convenience, public safety, and surveillance. A key task in passive monitoring is multiobject tracking (MOT). This paper presents a Bayesian method for multisensor MOT for challenging tracking problems where the object states are high-dimensional, and the measurements follow a nonlinear model. Our method is developed in the framework of factor graphs and the sum-product algorithm (SPA). The multimodal probability density functions (pdfs) provided by the SPA are effectively represented by a Gaussian mixture model (GMM). To perform the operations of the SPA in high-dimensional spaces, we make use of particle flow (PFL). Here, particles are migrated towards regions of high likelihood based on the solution of a partial differential equation. This makes it possible to obtain good object detection and tracking performance even in challenging multisensor MOT scenarios with single-sensor measurements that have a lower dimension than the object positions. We perform a numerical evaluation in a passive acoustic monitoring scenario where multiple sources are tracked in 3-D from 1-D time-difference-of-arrival (TDOA) measurements provided by pairs of hydrophones. Our numerical results demonstrate favorable detection and estimation accuracy compared to state-of-the-art reference techniques.
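For concreteness, the 1-D TDOA measurement produced by a hydrophone pair can be written as the range difference to the source divided by the sound speed. The sketch below is an illustrative measurement model with assumed names and noise level, not the paper's implementation.

```python
import numpy as np

SOUND_SPEED = 1500.0  # m/s, nominal speed of sound in water (assumption)

def tdoa_measurement(source_pos, hydro_a, hydro_b, noise_std=1e-4, rng=None):
    """TDOA (seconds) between two hydrophones for a 3-D source position, plus Gaussian noise."""
    if rng is None:
        rng = np.random.default_rng()
    tdoa = (np.linalg.norm(source_pos - hydro_a)
            - np.linalg.norm(source_pos - hydro_b)) / SOUND_SPEED
    return tdoa + rng.normal(0.0, noise_std)
```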
Location-aware networks will introduce new services and applications for modern convenience, surveillance, and public safety. In this paper, we consider the problem of cooperative localization in a wireless network where the position of certain anchor nodes can be controlled. We introduce an active planning method that aims at moving the anchors such that the information gain of future measurements is maximized. In the control layer of the proposed method, control inputs are calculated by minimizing the traces of approximate inverse Bayesian Fisher information matrices (FIMs). The estimation layer computes estimates of the agent states and provides Gaussian representations of marginal posteriors of agent positions to the control layer for approximate Bayesian FIM computations. Based on a cost function that accumulates Bayesian FIM contributions over a sliding window of discrete future timesteps, a receding horizon (RH) control is performed. Approximations that make it possible to solve the resulting tree-search problem efficiently are also discussed. A numerical case study demonstrates the intelligent behavior of a single controlled anchor in a 3-D scenario and the resulting significantly improved localization accuracy.
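The control-layer cost can be sketched as the accumulated trace of approximate inverse Bayesian FIMs over the planning horizon. In the snippet below, `predict_fim` is a hypothetical placeholder for the FIM approximation computed from the Gaussian marginal posteriors supplied by the estimation layer; the interface is an assumption for illustration.

```python
import numpy as np

def horizon_cost(anchor_trajectory, agent_posteriors, predict_fim):
    """Sum of tr(J^{-1}) over the sliding window; lower cost means more position information."""
    cost = 0.0
    for anchor_pos, posterior in zip(anchor_trajectory, agent_posteriors):
        J = predict_fim(anchor_pos, posterior)   # approximate Bayesian FIM, shape (d, d)
        cost += np.trace(np.linalg.inv(J))
    return cost
```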
Knowledge distillation (KD) has gained a lot of attention in the field of model compression for edge devices thanks to its effectiveness in compressing large powerful networks into smaller lower-capacity models. Online distillation, in which both the teacher and the student are learning collaboratively, has also gained much interest due to its ability to improve the performance of the networks involved. The Kullback-Leibler (KL) divergence ensures the proper knowledge transfer between the teacher and student. However, most online KD techniques exhibit bottlenecks when there is a capacity gap between the networks: when the models are trained cooperatively and simultaneously, the KL distance becomes incapable of properly minimizing the discrepancy between the teacher's and student's distributions. Alongside accuracy, critical edge device applications are in need of well-calibrated compact networks. Confidence calibration provides a sensible way of getting trustworthy predictions. We propose BD-KD: Balancing of Divergences for online Knowledge Distillation. We show that adaptively balancing between the reverse and forward divergences shifts the focus of the training strategy to the compact student network without limiting the teacher network's learning process. We demonstrate that, by performing this balancing design at the level of the student distillation loss, we improve both the accuracy and the calibration of the compact student network. We conducted extensive experiments using a variety of network architectures and show improvements on multiple datasets including CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet. We illustrate the effectiveness of our approach through comprehensive comparisons and ablations with current state-of-the-art online and offline KD techniques.
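The balancing idea can be illustrated with a small loss sketch. The fixed weighting below is an assumption for illustration (the paper's adaptive balancing may differ): the student's distillation loss mixes the forward KL (teacher as target) and the reverse KL (student as target).

```python
import torch
import torch.nn.functional as F

def bd_kd_student_loss(student_logits, teacher_logits, alpha=0.5, T=4.0):
    """Hedged sketch of a balanced forward/reverse KL distillation loss for the student."""
    p_s = F.log_softmax(student_logits / T, dim=-1)
    p_t = F.log_softmax(teacher_logits / T, dim=-1)
    # forward KL(teacher || student): teacher distribution is the target
    fwd = F.kl_div(p_s, p_t, log_target=True, reduction="batchmean")
    # reverse KL(student || teacher): student distribution is the target
    rev = F.kl_div(p_t, p_s, log_target=True, reduction="batchmean")
    return (alpha * fwd + (1.0 - alpha) * rev) * (T * T)
```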
An optimal delivery of arguments is key to persuasion in any debate, both for humans and for AI systems. This requires the use of clear and fluent claims relevant to the given debate. Prior work has studied the automatic assessment of argument quality extensively. Yet, no approach actually improves the quality so far. Our work is the first step towards filling this gap. We propose the task of claim optimization: to rewrite argumentative claims to optimize their delivery. As an initial approach, we first generate a candidate set of optimized claims using a sequence-to-sequence model, such as BART, while taking into account contextual information. Our key idea is then to rerank generated candidates with respect to different quality metrics to find the best optimization. In automatic and human evaluation, we outperform different reranking baselines on an English corpus, improving 60% of all claims (worsening 16% only). Follow-up analyses reveal that, beyond copy editing, our approach often specifies claims with details, whereas it adds less evidence than humans do. Moreover, its capabilities generalize well to other domains, such as instructional texts.
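A minimal generate-then-rerank sketch is shown below. The checkpoint name and the user-supplied `quality_score` function (standing in for the paper's quality metrics) are assumptions for illustration; this is not the authors' pipeline.

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

def optimize_claim(claim, quality_score, num_candidates=8):
    """Generate candidate rewrites of a claim and keep the one ranked best by quality_score."""
    inputs = tokenizer(claim, return_tensors="pt")
    outputs = model.generate(**inputs, num_beams=num_candidates,
                             num_return_sequences=num_candidates, max_length=64)
    candidates = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    return max(candidates, key=quality_score)
```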
Algorithmic solutions for multi-object tracking (MOT) are a key enabler for applications in autonomous navigation and applied ocean sciences. State-of-the-art MOT methods fully rely on a statistical model and typically use preprocessed sensor data as measurements. In particular, measurements are produced by a detector that extracts potential object locations from the raw sensor data collected for a discrete time step. This preparatory processing step reduces data flow and computational complexity but may result in a loss of information. State-of-the-art Bayesian MOT methods that are based on belief propagation (BP) systematically exploit graph structures of the statistical model to reduce computational complexity and improve scalability. However, as a fully model-based approach, BP can only provide suboptimal estimates when there is a mismatch between the statistical model and the true data-generating process. Existing BP-based MOT methods can further only make use of preprocessed measurements. In this paper, we introduce a variant of BP that combines model-based with data-driven MOT. The proposed neural enhanced belief propagation (NEBP) method complements the statistical model of BP by information learned from raw sensor data. This approach conjectures that the learned information can reduce model mismatch and thus improve data association and false alarm rejection. Our NEBP method improves tracking performance compared to model-based methods. At the same time, it inherits the advantages of BP-based MOT, i.e., it scales only quadratically in the number of objects, and it can thus generate and maintain a large number of object tracks. We evaluate the performance of our NEBP approach for MOT on the nuScenes autonomous driving dataset and demonstrate that it has state-of-the-art performance.
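One hedged way to picture the "neural enhancement" is a learned module that rescales the model-based BP messages using features extracted from the raw sensor data; all names and the specific form below are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class MessageEnhancer(nn.Module):
    """Sketch: a small network rescales model-based BP messages using learned features."""
    def __init__(self, feat_dim=32, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1), nn.Softplus())

    def forward(self, bp_messages, raw_features):
        # bp_messages: (E,) model-based messages; raw_features: (E, feat_dim) from raw data
        return bp_messages * self.net(raw_features).squeeze(-1)
```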
The identification of material parameters occurring in constitutive models has a wide range of applications in practice. One of these applications is the monitoring and assessment of the actual condition of infrastructure buildings, as the material parameters directly reflect the resistance of the structures to external impacts. Physics-informed neural networks (PINNs) have recently emerged as a suitable method for solving inverse problems. One advantage of this method is the straightforward inclusion of observation data: unlike grid-based methods, such as the finite element model updating (FEMU) approach, no computational grid and no interpolation of the data are required. In the current work, we aim to further develop PINNs towards the calibration of the linear-elastic constitutive model from full-field displacement and global force data in a realistic regime. We show that normalization and conditioning of the optimization problem play a crucial role in this process. Therefore, among other measures, we determine initial estimates for the material parameters and balance the individual terms in the loss function. In order to reduce the dependence of the identified material parameters on local errors in the displacement approximation, we base the identification not on the stress boundary conditions but instead on the global balance of internal and external work. In addition, we find that the inverse problem becomes better posed if we reformulate it in terms of the bulk and shear modulus instead of Young's modulus and Poisson's ratio. We demonstrate that the enhanced PINNs are capable of identifying material parameters from both experimental one-dimensional data and synthetic full-field displacement data in a realistic regime. Since displacement data measured by, e.g., a digital image correlation (DIC) system is noisy, we additionally investigate the robustness of the method to different levels of noise.
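For reference, the reparametrization mentioned above relies on the standard isotropic linear-elasticity relations between (E, ν) and (K, G); the conversions below are textbook formulas, not code from the paper.

```python
def bulk_shear_from_young_poisson(E, nu):
    """Convert Young's modulus E and Poisson's ratio nu to bulk modulus K and shear modulus G."""
    K = E / (3.0 * (1.0 - 2.0 * nu))
    G = E / (2.0 * (1.0 + nu))
    return K, G

def young_poisson_from_bulk_shear(K, G):
    """Inverse conversion: (K, G) back to (E, nu)."""
    E = 9.0 * K * G / (3.0 * K + G)
    nu = (3.0 * K - 2.0 * G) / (2.0 * (3.0 * K + G))
    return E, nu
```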
Modal verbs (e.g., "can", "should", or "must") occur highly frequently in scientific articles. Decoding their function is not straightforward: they are often used for hedging, but they may also denote abilities and restrictions. Understanding their meaning is important for various NLP tasks such as writing assistance or accurate information extraction from scientific text. To foster research on the usage of modals in this genre, we introduce the MIST (Modals In Scientific Text) dataset, which contains 3737 modal instances in five scientific domains annotated for their semantic, pragmatic, or rhetorical function. We systematically evaluate a set of competitive neural architectures on MIST. Transfer experiments reveal that leveraging non-scientific data is of limited benefit for modeling the distinctions in MIST. Our corpus analysis provides evidence that scientific communities differ in their usage of modal verbs, yet, classifiers trained on scientific data generalize to some extent to unseen scientific domains.
Understanding customer feedback is becoming a necessity for companies to identify problems and improve their products and services. Text classification and sentiment analysis can play a major role in analyzing this data by using a variety of machine and deep learning approaches. In this work, different transformer-based models are utilized to explore how efficient these models are when working with a German customer feedback dataset. In addition, these pre-trained models are further analyzed to determine whether adapting them to a specific domain using unlabeled data can yield better results than off-the-shelf pre-trained models. To evaluate the models, two downstream tasks from GermEval 2017 are considered. The experimental results show that transformer-based models achieve significant improvements over a fastText baseline and outperform the previously published scores and models. For the subtask Relevance Classification, the best models achieve a micro-averaged $F1$-Score of 96.1 % on the first test set and 95.9 % on the second one, and scores of 85.1 % and 85.3 % for the subtask Polarity Classification.
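The domain-adaptation step can be sketched as continued masked-language-model pretraining on unlabeled in-domain text before fine-tuning. The checkpoint name and data file below are assumptions for illustration, not the models or data used in the paper.

```python
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

# Assumed German checkpoint and a hypothetical file of unlabeled customer feedback
tokenizer = AutoTokenizer.from_pretrained("bert-base-german-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-german-cased")

raw = load_dataset("text", data_files={"train": "unlabeled_feedback.txt"})
tokenized = raw["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

# Continued MLM pretraining on in-domain text; the adapted model is then fine-tuned
# on the labeled GermEval 2017 subtasks.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="adapted-model", num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
trainer.train()
```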